Reenacting Transactions to Compute their Provenance

Authors

  • Bahareh Arab
  • Dieter Gawlick
  • Vasudha Krishnaswamy
  • Venkatesh Radhakrishnan
  • Boris Glavic

Abstract

Database provenance is essential for auditing, data debugging, understanding transformations, and many additional use cases. While these applications benefit from state-of-the-art provenance tracking for queries, most use cases also require provenance for transactional updates. We present the first provenance model for concurrent database transactions. Our model extends the well-known semiring provenance framework with version annotations and update operations. Based on this model, we present the first solution for computing the provenance of database transactions. Our approach can retroactively trace transaction provenance without storing any additional information, as long as an audit log and time travel functionality are available (both are supported by most DBMS). For a given transaction, our approach constructs a reenactment query that simulates the effect of the transaction. This query is guaranteed to produce the updated versions of the tables produced by the transaction and has the same provenance as the original transaction, i.e., it is annotation-equivalent. Using time travel and adopting well-known techniques for computing the provenance of queries, we can use reenactment to retroactively compute the provenance of transactions. Currently, we support two widely applied concurrency control mechanisms: snapshot isolation and read committed snapshot isolation. We have implemented a prototype on top of a commercial database system, and our experiments confirm that (1) the runtime and storage overhead required to support time travel and the audit log is tolerable and (2) by applying novel optimizations we can efficiently compute the provenance of large transactions over large data sets.
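
To make the idea of a reenactment query concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a single UPDATE statement can be simulated by a SELECT that applies the new value to rows matching the update's condition and passes all other rows through unchanged. The helper `reenact_update`, the `account` table, and the copied table `account_before` (a stand-in for time travel access to the pre-update state) are hypothetical names introduced for illustration. The sketch covers neither concurrency nor provenance annotations.

```python
import sqlite3


def reenact_update(table, columns, set_exprs, where):
    """Build a SELECT that returns the table state an UPDATE would produce.

    Hypothetical helper: every updated column becomes
    CASE WHEN <where> THEN <new value> ELSE <old value> END,
    and every other column is passed through unchanged.
    """
    projections = []
    for col in columns:
        if col in set_exprs:
            projections.append(
                f"CASE WHEN {where} THEN {set_exprs[col]} ELSE {col} END AS {col}"
            )
        else:
            projections.append(col)
    return f"SELECT {', '.join(projections)} FROM {table}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50), (3, 700)]
)

# Assumption: a plain copy stands in for time travel to the pre-update version.
conn.execute("CREATE TABLE account_before AS SELECT * FROM account")

# The original update, as it might appear in an audit log.
conn.execute("UPDATE account SET balance = balance + 10 WHERE balance < 200")

# Reenact the update as a query over the old version of the table.
query = reenact_update(
    "account_before",
    ["id", "balance"],
    {"balance": "balance + 10"},
    "balance < 200",
)
reenacted = conn.execute(query + " ORDER BY id").fetchall()
actual = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()

print(reenacted == actual)
```

Because the reenactment is an ordinary query over the earlier table version, existing techniques for computing query provenance could in principle be applied to it, which is the property the abstract relies on.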

Related resources

Formal Foundations of Reenactment and Transaction Provenance

Provenance is essential for auditing, data debugging, understanding transformations, and many additional use cases. All these use cases would benefit from provenance for transactional updates. We present a provenance model for snapshot isolation transactions extending the semiring framework with version annotations and updates. Based on this model, we present the first solution for computing th...

Computational provenance in hydrologic science: a snow mapping example.

Computational provenance--a record of the antecedents and processing history of digital information--is key to properly documenting computer-based scientific research. To support investigations in hydrologic science, we produce the daily fractional snow-covered area from NASA's moderate-resolution imaging spectroradiometer (MODIS). From the MODIS reflectance data in seven wavelengths, we estima...

A Generic Provenance Middleware for Database Queries, Updates, and Transactions

We present an architecture and prototype implementation for a generic provenance database middleware (GProM) that is based on the concept of query rewrites, which are applied to an algebraic graph representation of database operations. The system supports a wide range of provenance types and representations for queries, updates, transactions, and operations spanning multiple transactions. GProM...

Debugging Transactions and Tracking their Provenance with Reenactment

Debugging transactions and understanding their execution are of immense importance for developing OLAP applications, to trace causes of errors in production systems, and to audit the operations of a database. However, debugging transactions is hard for several reasons: 1) after the execution of a transaction, its input is no longer available for debugging, 2) internal states of a transaction ar...

Semantic Representation of Provenance in Wikipedia

Wikis are often considered as being a wide source of information. However, identifying provenance information about their content is crucial, whether it is for computing trust in public wiki pages or to identify experts in corporate wikis. In this paper, we address this issue by providing a lightweight ontology for provenance management in wikis, based on the W7 model. Furthermore, we showcase ...

Publication date: 2014